Genome-wide identification and predictive modeling of tissue-specific alternative polyadenylation
Abstract
MOTIVATION Pre-mRNA cleavage and polyadenylation are essential steps for 3'-end maturation and for the subsequent stability and degradation of mRNAs. This process is tightly controlled by cis-regulatory elements surrounding the cleavage/polyadenylation sites (polyA sites), which are frequently constrained in sequence content and position. More than 50% of human transcripts have multiple functional polyA sites, and the specific use of alternative polyA sites (APA) results in isoforms with variable 3'-untranslated regions, thus potentially affecting gene regulation. Elucidating the regulatory mechanisms underlying differential polyA preferences in multiple cell types has been hindered by the lack of both suitable data on the precise location of cleavage sites and appropriate tests for determining APAs with significant differences across multiple libraries.
RESULTS We applied a tailored paired-end RNA-seq protocol to specifically probe the position of polyA sites in three human adult tissue types. We specified a linear-effects regression model to identify tissue-specific biases indicating regulated APA; the significance of differences between tissue types was assessed by an appropriately designed permutation test. This combination allowed us to identify highly specific subsets of APA events in the individual tissue types. Predictive models successfully distinguished constitutive polyA sites from a biologically relevant background (auROC = 99.6%), as well as tissue-specific regulated sets from each other. We found that the main cis-regulatory elements described for polyadenylation are a strong, and highly informative, hallmark for constitutive sites only. Tissue-specific regulated sites were found to contain other regulatory motifs, with the canonical polyadenylation signal being nearly absent at brain-specific polyA sites. Together, our results contribute to the understanding of the diversity of post-transcriptional gene regulation.
AVAILABILITY Raw data are deposited on SRA, accession numbers: brain SRX208132, kidney SRX208087 and liver SRX208134. Processed datasets as well as model code are published on our website: http://www.genome.duke.edu/labs/ohler/research/UTR/. CONTACT [email protected].
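The abstract does not spell out the permutation test, so the following is only a minimal sketch of the general idea, not the authors' regression model or their test design. It assumes hypothetical read counts per polyA site and tissue for a single gene, uses the largest gap between a site's within-tissue usage fraction and its pooled usage fraction as the test statistic, and shuffles the tissue labels of individual reads to build the null distribution. The function names, the statistic, and the counts are all illustrative assumptions.

```python
import random
from collections import Counter


def max_usage_deviation(sites, tissues):
    """Largest absolute gap between a site's usage fraction within one
    tissue and its usage fraction pooled over all tissues."""
    pooled = Counter(sites)
    per_tissue_total = Counter(tissues)
    joint = Counter(zip(sites, tissues))
    n = len(sites)
    return max(
        abs(joint[(s, t)] / per_tissue_total[t] - pooled[s] / n)
        for s in pooled
        for t in per_tissue_total
    )


def permutation_pvalue(counts, n_perm=2000, seed=0):
    """counts: {tissue: {site: read_count}} for one gene (hypothetical input).

    Returns a permutation p-value for the null hypothesis that polyA site
    usage does not depend on tissue.
    """
    # Expand the count table into one (site, tissue) label pair per read.
    sites, tissues = [], []
    for tissue, site_counts in counts.items():
        for site, c in site_counts.items():
            sites.extend([site] * c)
            tissues.extend([tissue] * c)

    rng = random.Random(seed)
    observed = max_usage_deviation(sites, tissues)
    shuffled = tissues[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any site/tissue association
        if max_usage_deviation(sites, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical counts: the distal site dominates in brain only.
example = {
    "brain": {"proximal": 10, "distal": 40},
    "kidney": {"proximal": 40, "distal": 10},
    "liver": {"proximal": 38, "distal": 12},
}
p_value = permutation_pvalue(example)
```

Shuffling read-level tissue labels keeps both the per-tissue library sizes and the pooled site usage fixed, which is one simple way to obtain a null distribution when each tissue contributes a single library. A genome-wide analysis would additionally need multiple-testing correction across genes.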
Similar resources
Genome-wide mapping of polyadenylation sites in fission yeast reveals widespread alternative polyadenylation
Regulatory elements in the 3' untranslated regions (UTRs) of eukaryotic mRNAs influence mRNA localization, translation, and stability. 3'-UTR length is determined by the location at which mRNAs are cleaved and polyadenylated. The use of alternative polyadenylation sites is common, and can be regulated in different situations. I present a new method to identify cleavage and polyadenylation sites...
Identification of allele-specific alternative mRNA processing via transcriptome sequencing
Establishing the functional roles of genetic variants remains a significant challenge in the post-genomic era. Here, we present a method, allele-specific alternative mRNA processing (ASARP), to identify genetically influenced mRNA processing events using transcriptome sequencing (RNA-Seq) data. The method examines RNA-Seq data at both single-nucleotide and whole-gene/isoform levels to identify ...
Comprehensive Polyadenylation Site Maps in Yeast and Human Reveal Pervasive Alternative Polyadenylation
The emerging discoveries on the link between polyadenylation and disease states underline the need to fully characterize genome-wide polyadenylation states. Here, we report comprehensive maps of global polyadenylation events in human and yeast generated using refinements to the Direct RNA Sequencing technology. This direct approach provides a quantitative view of genome-wide polyadenylation sta...
Computational analysis of 3'-ends of ESTs shows four classes of alternative polyadenylation in human, mouse, and rat.
Alternative initiation, splicing, and polyadenylation are key mechanisms used by many organisms to generate diversity among mature mRNA transcripts originating from the same transcription unit. While previous computational analyses of alternative polyadenylation have focused on polyadenylation activities within or downstream of the normal 3'-terminal exons, we present the results of the first g...
Identification of human-specific transcript variants induced by DNA insertions in the human genome
MOTIVATION Many genes in the human genome produce a wide variety of transcript variants resulting from alternative exon splicing, differential promoter usage, or altered polyadenylation site utilization that may function differently in human cells. Here, we present a bioinformatics method for the systematic identification of human-specific novel transcript variants that might have arisen after ...